{
 "cells": [
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "<a href=\"https://colab.research.google.com/github/jeffheaton/t81_558_deep_learning/blob/master/t81_558_class_04_5_rmse_logloss.ipynb\" target=\"_parent\"><img src=\"https://colab.research.google.com/assets/colab-badge.svg\" alt=\"Open In Colab\"/></a>"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "# T81-558: Applications of Deep Neural Networks\n",
    "**Module 4: Training for Tabular Data**\n",
    "* Instructor: [Jeff Heaton](https://sites.wustl.edu/jeffheaton/), McKelvey School of Engineering, [Washington University in St. Louis](https://engineering.wustl.edu/Programs/Pages/default.aspx)\n",
    "* For more information visit the [class website](https://sites.wustl.edu/jeffheaton/t81-558/)."
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "# Module 4 Material\n",
    "\n",
    "* Part 4.1: Encoding a Feature Vector for Keras Deep Learning [[Video]](https://www.youtube.com/watch?v=Vxz-gfs9nMQ&list=PLjy4p-07OYzulelvJ5KVaT2pDlxivl_BN) [[Notebook]](https://github.com/jeffheaton/t81_558_deep_learning/blob/master/t81_558_class_04_1_feature_encode.ipynb)\n",
    "* Part 4.2: Keras Multiclass Classification for Deep Neural Networks with ROC and AUC [[Video]](https://www.youtube.com/watch?v=-f3bg9dLMks&list=PLjy4p-07OYzulelvJ5KVaT2pDlxivl_BN) [[Notebook]](https://github.com/jeffheaton/t81_558_deep_learning/blob/master/t81_558_class_04_2_multi_class.ipynb)\n",
    "* Part 4.3: Keras Regression for Deep Neural Networks with RMSE [[Video]](https://www.youtube.com/watch?v=wNhBUC6X5-E&list=PLjy4p-07OYzulelvJ5KVaT2pDlxivl_BN) [[Notebook]](https://github.com/jeffheaton/t81_558_deep_learning/blob/master/t81_558_class_04_3_regression.ipynb)\n",
    "* Part 4.4: Backpropagation, Nesterov Momentum, and ADAM Neural Network Training [[Video]](https://www.youtube.com/watch?v=VbDg8aBgpck&list=PLjy4p-07OYzulelvJ5KVaT2pDlxivl_BN) [[Notebook]](https://github.com/jeffheaton/t81_558_deep_learning/blob/master/t81_558_class_04_4_backprop.ipynb)\n",
    "* **Part 4.5: Neural Network RMSE and Log Loss Error Calculation from Scratch** [[Video]](https://www.youtube.com/watch?v=wmQX1t2PHJc&list=PLjy4p-07OYzulelvJ5KVaT2pDlxivl_BN) [[Notebook]](https://github.com/jeffheaton/t81_558_deep_learning/blob/master/t81_558_class_04_5_rmse_logloss.ipynb)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "# Google CoLab Instructions\n",
    "\n",
    "The following code ensures that Google CoLab is running the correct version of TensorFlow."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 1,
   "metadata": {},
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "Note: not using Google CoLab\n"
     ]
    }
   ],
   "source": [
    "try:\n",
    "    %tensorflow_version 2.x\n",
    "    COLAB = True\n",
    "    print(\"Note: using Google CoLab\")\n",
    "except:\n",
    "    print(\"Note: not using Google CoLab\")\n",
    "    COLAB = False"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "# Part 4.5: Error Calculation from Scratch\n",
    "\n",
    "We will now look at how to calculate RMSE and logloss by hand.  RMSE is typically used for regression. We begin by calculating RMSE with libraries."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 2,
   "metadata": {},
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "Score (MSE): 0.14200000000000007\n",
      "Score (RMSE): 0.37682887362833556\n"
     ]
    }
   ],
   "source": [
    "from sklearn import metrics\n",
    "import numpy as np\n",
    "\n",
    "predicted = [1.1,1.9,3.4,4.2,4.3]\n",
    "expected = [1,2,3,4,5]\n",
    "\n",
    "score_mse = metrics.mean_squared_error(predicted,expected)\n",
    "score_rmse = np.sqrt(score_mse)\n",
    "print(\"Score (MSE): {}\".format(score_mse))\n",
    "print(\"Score (RMSE): {}\".format(score_rmse))\n"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "We can also calculate without libraries."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 3,
   "metadata": {},
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "Score (MSE): 0.14200000000000007\n",
      "Score (RMSE): 0.37682887362833556\n"
     ]
    }
   ],
   "source": [
    "score_mse = ((predicted[0]-expected[0])**2 + (predicted[1]-expected[1])**2 \n",
    "+ (predicted[2]-expected[2])**2 + (predicted[3]-expected[3])**2\n",
    "+ (predicted[4]-expected[4])**2)/len(predicted)\n",
    "score_rmse = np.sqrt(score_mse)\n",
    "    \n",
    "print(\"Score (MSE): {}\".format(score_mse))\n",
    "print(\"Score (RMSE): {}\".format(score_rmse))"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## Classification\n",
    "\n",
    "We will now look at how to calculate a logloss by hand. For this, we look at a binary prediction. The predicted is some number between 0-1 that indicates the probability true (1). The expected is always 0 or 1. Therefore, a prediction of 1.0 is completely correct if the expected is 1 and completely wrong if the expected is 0."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 4,
   "metadata": {
    "scrolled": true
   },
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "0.06678801305495843\n"
     ]
    }
   ],
   "source": [
    "from sklearn import metrics\n",
    "\n",
    "expected = [1,1,0,0,0]\n",
    "predicted = [0.9,0.99,0.1,0.05,0.06]\n",
    "\n",
    "print(metrics.log_loss(expected,predicted))"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Now we attempt to calculate the same logloss manually."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 5,
   "metadata": {},
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "Score Logloss 0.06678801305495843\n"
     ]
    }
   ],
   "source": [
    "import numpy as np\n",
    "\n",
    "score_logloss = (np.log(1.0-np.abs(expected[0]-predicted[0]))+\\\n",
    "np.log(1.0-np.abs(expected[1]-predicted[1]))+\\\n",
    "np.log(1.0-np.abs(expected[2]-predicted[2]))+\\\n",
    "np.log(1.0-np.abs(expected[3]-predicted[3]))+\\\n",
    "np.log(1.0-np.abs(expected[4]-predicted[4])))\\\n",
    "*(-1/len(predicted))\n",
    "\n",
    "print(f'Score Logloss {score_logloss}')"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": []
  }
 ],
 "metadata": {
  "anaconda-cloud": {},
  "kernelspec": {
   "display_name": "Python 3.9 (tensorflow)",
   "language": "python",
   "name": "tensorflow"
  },
  "language_info": {
   "codemirror_mode": {
    "name": "ipython",
    "version": 3
   },
   "file_extension": ".py",
   "mimetype": "text/x-python",
   "name": "python",
   "nbconvert_exporter": "python",
   "pygments_lexer": "ipython3",
   "version": "3.9.7"
  }
 },
 "nbformat": 4,
 "nbformat_minor": 4
}
